Overview

Dataset statistics

Number of variables: 40
Number of observations: 26990
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.7 MiB
Average record size in memory: 104.0 B

Variable types

Numeric: 7
Categorical: 1
Boolean: 32

Alerts

is_starrable_False has constant value "True" (Constant)
location_state_England is highly correlated with country_GB and 1 other field (High correlation)
country_GB is highly correlated with location_state_England and 1 other field (High correlation)
country_US is highly correlated with location_state_England and 1 other field (High correlation)
category_Other is highly correlated with is_starrable_False (High correlation)
country_AU is highly correlated with is_starrable_False (High correlation)
category_Performances is highly correlated with is_starrable_False (High correlation)
category_Poetry is highly correlated with is_starrable_False (High correlation)
location_state_PA is highly correlated with is_starrable_False (High correlation)
country_CA is highly correlated with is_starrable_False (High correlation)
location_state_MA is highly correlated with is_starrable_False (High correlation)
location_state_CA is highly correlated with is_starrable_False (High correlation)
country_IT is highly correlated with is_starrable_False (High correlation)
staff_pick_True is highly correlated with is_starrable_False (High correlation)
country_MX is highly correlated with is_starrable_False (High correlation)
country_ES is highly correlated with is_starrable_False (High correlation)
location_state_England is highly correlated with country_GB and 2 other fields (High correlation)
location_state_IL is highly correlated with is_starrable_False (High correlation)
country_GB is highly correlated with location_state_England and 2 other fields (High correlation)
category_Jewelry is highly correlated with is_starrable_False (High correlation)
location_state_Other is highly correlated with is_starrable_False (High correlation)
location_state_TX is highly correlated with is_starrable_False (High correlation)
country_DE is highly correlated with is_starrable_False (High correlation)
category_Graphic Novels is highly correlated with is_starrable_False (High correlation)
category_Wearables is highly correlated with is_starrable_False (High correlation)
location_state_WA is highly correlated with is_starrable_False (High correlation)
country_Other is highly correlated with is_starrable_False (High correlation)
category_Narrative Film is highly correlated with is_starrable_False (High correlation)
country_US is highly correlated with location_state_England and 2 other fields (High correlation)
category_Tabletop Games is highly correlated with is_starrable_False (High correlation)
location_state_NY is highly correlated with is_starrable_False (High correlation)
is_starrable_False is highly correlated with category_Other and 31 other fields (High correlation)
category_Dance is highly correlated with is_starrable_False (High correlation)
location_state_FL is highly correlated with is_starrable_False (High correlation)
category_Classical Music is highly correlated with is_starrable_False (High correlation)
country_FR is highly correlated with is_starrable_False (High correlation)
state is highly correlated with is_starrable_False (High correlation)
location_state_CA is highly correlated with location_state_Other (High correlation)
location_state_Other is highly correlated with location_state_CA (High correlation)
country_CA is highly correlated with country_US (High correlation)
goal is highly skewed (γ1 = 61.00201993) (Skewed)
id has unique values (Unique)
prep_time has 2948 (10.9%) zeros (Zeros)
weekday_of_launch has 4659 (17.3%) zeros (Zeros)
hour_of_launch has 1045 (3.9%) zeros (Zeros)
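
The Skewed alert for goal is based on the skewness coefficient γ1. As a rough illustration of what that number measures, the sketch below computes the plain moment coefficient of skewness (third central moment divided by the 1.5 power of the second) on made-up toy data. The function name and the sample lists are hypothetical, and the report's own estimator may apply a small-sample bias correction, so its values can differ slightly from this formula.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy data, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]      # balanced around its mean
long_tail = [1, 2, 3, 4, 100]    # one very large value, like a huge goal

print(skewness(symmetric))
print(skewness(long_tail))
```

A single extreme value is enough to push the coefficient well above zero, which is consistent with goal having a median of 3500 but a maximum of 50000000.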

Reproduction

Analysis started: 2022-05-18 20:35:53.452982
Analysis finished: 2022-05-18 20:36:18.171706
Duration: 24.72 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

goal
Real number (ℝ≥0)

SKEWED

Distinct: 1567
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29512.47891
Minimum: 0.01
Maximum: 50000000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 421.7 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 250
Q1: 1000
median: 3500
Q3: 10000
95-th percentile: 50000
Maximum: 50000000
Range: 49999999.99
Interquartile range (IQR): 9000

Descriptive statistics

Standard deviation: 504033.4892
Coefficient of variation (CV): 17.07865648
Kurtosis: 4807.466613
Mean: 29512.47891
Median Absolute Deviation (MAD): 3000
Skewness: 61.00201993
Sum: 796541805.9
Variance: 2.540497583 × 10^11
Monotonicity: Not monotonic
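
Several of the goal statistics are simple functions of the others, so they can be cross-checked by hand. The short sketch below does this using the standard definitions (CV = standard deviation / mean, IQR = Q3 - Q1, range = maximum - minimum, variance = standard deviation squared). The input figures are copied from this report; the variable names are illustrative only.

```python
# Figures copied from the goal statistics in this report.
mean = 29512.47891
std = 504033.4892
q1, q3 = 1000, 10000
minimum, maximum = 0.01, 50000000

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance from the standard deviation

print(f"CV       = {cv:.6f}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range:.2f}")
print(f"Variance = {variance:.6e}")
```

A CV far above 1 together with an IQR of only 9000 indicates that the spread is driven by a small number of very large goals rather than by the bulk of the campaigns.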
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
5000 | 1747 | 6.5%
1000 | 1325 | 4.9%
10000 | 1315 | 4.9%
500 | 1266 | 4.7%
3000 | 1208 | 4.5%
2000 | 1170 | 4.3%
1500 | 831 | 3.1%
2500 | 830 | 3.1%
15000 | 755 | 2.8%
20000 | 631 | 2.3%
Other values (1557) | 15912 | 59.0%

Value | Count | Frequency (%)
0.01 | 1 | < 0.1%
1 | 59 | 0.2%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 2 | < 0.1%
5 | 8 | < 0.1%
7 | 1 | < 0.1%
8 | 3 | < 0.1%
10 | 47 | 0.2%
12 | 4 | < 0.1%

Value | Count | Frequency (%)
50000000 | 1 | < 0.1%
33000000 | 1 | < 0.1%
25000000 | 1 | < 0.1%
20000000 | 3 | < 0.1%
10000000 | 7 | < 0.1%
9000000 | 2 | < 0.1%
7500000 | 1 | < 0.1%
7300000 | 1 | < 0.1%
6500001 | 1 | < 0.1%
6000000 | 1 | < 0.1%

id
Real number (ℝ≥0)

UNIQUE

Distinct: 26990
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1067811742
Minimum: 53154
Maximum: 2147466649
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 421.7 KiB

Quantile statistics

Minimum: 53154
5-th percentile: 103745519.9
Q1: 523759718.2
median: 1064504786
Q3: 1608616310
95-th percentile: 2038007840
Maximum: 2147466649
Range: 2147413495
Interquartile range (IQR): 1084856591

Descriptive statistics

Standard deviation: 623465831.2
Coefficient of variation (CV): 0.583872425
Kurtosis: -1.219457615
Mean: 1067811742
Median Absolute Deviation (MAD): 541931476
Skewness: 0.007666229066
Sum: 2.882023891 × 10^13
Variance: 3.887096427 × 10^17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
522081392 | 1 | < 0.1%
979492837 | 1 | < 0.1%
600835406 | 1 | < 0.1%
1733489739 | 1 | < 0.1%
570940247 | 1 | < 0.1%
670938521 | 1 | < 0.1%
2084970699 | 1 | < 0.1%
1815928657 | 1 | < 0.1%
1635909422 | 1 | < 0.1%
340941897 | 1 | < 0.1%
Other values (26980) | 26980 | > 99.9%

Value | Count | Frequency (%)
53154 | 1 | < 0.1%
113230 | 1 | < 0.1%
127800 | 1 | < 0.1%
171116 | 1 | < 0.1%
274865 | 1 | < 0.1%
285583 | 1 | < 0.1%
325875 | 1 | < 0.1%
379873 | 1 | < 0.1%
787066 | 1 | < 0.1%
911712 | 1 | < 0.1%

Value | Count | Frequency (%)
2147466649 | 1 | < 0.1%
2147460119 | 1 | < 0.1%
2147430599 | 1 | < 0.1%
2147416747 | 1 | < 0.1%
2147380316 | 1 | < 0.1%
2147364781 | 1 | < 0.1%
2147339483 | 1 | < 0.1%
2147336747 | 1 | < 0.1%
2147180546 | 1 | < 0.1%
2147034766 | 1 | < 0.1%

state
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 421.7 KiB

Length

Max length: 10
Median length: 10
Mean length: 8.844164505
Min length: 6

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: successful
2nd row: successful
3rd row: successful
4th row: successful
5th row: successful

Common Values

Value | Count | Frequency (%)
successful | 19191 | 71.1%
failed | 7799 | 28.9%

Length

Histogram of lengths of the category

Pie chart


campaign_length
Real number (ℝ≥0)

Distinct: 85
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.45075954
Minimum: 1
Maximum: 91
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 421.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 14
Q1: 28
median: 30
Q3: 33
95-th percentile: 60
Maximum: 91
Range: 90
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 12.00175685
Coefficient of variation (CV): 0.3816046743
Kurtosis: 1.839634985
Mean: 31.45075954
Median Absolute Deviation (MAD): 2
Skewness: 0.9733681399
Sum: 848856
Variance: 144.0421675
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
30 | 10046 | 37.2%
29 | 1642 | 6.1%
60 | 1482 | 5.5%
31 | 866 | 3.2%
21 | 785 | 2.9%
45 | 764 | 2.8%
14 | 745 | 2.8%
20 | 691 | 2.6%
28 | 621 | 2.3%
35 | 600 | 2.2%
Other values (75) | 8748 | 32.4%

Value | Count | Frequency (%)
1 | 19 | 0.1%
2 | 10 | < 0.1%
3 | 25 | 0.1%
4 | 20 | 0.1%
5 | 72 | 0.3%
6 | 69 | 0.3%
7 | 192 | 0.7%
8 | 55 | 0.2%
9 | 78 | 0.3%
10 | 200 | 0.7%

Value | Count | Frequency (%)
91 | 1 | < 0.1%
90 | 23 | 0.1%
89 | 21 | 0.1%
88 | 8 | < 0.1%
86 | 2 | < 0.1%
84 | 1 | < 0.1%
83 | 2 | < 0.1%
82 | 2 | < 0.1%
81 | 2 | < 0.1%
80 | 3 | < 0.1%

prep_time
Real number (ℝ≥0)

ZEROS

Distinct: 769
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47.58947758
Minimum: 0
Maximum: 3318
Zeros: 2948
Zeros (%): 10.9%
Negative: 0
Negative (%): 0.0%
Memory size: 421.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
median: 13
Q3: 37
95-th percentile: 192
Maximum: 3318
Range: 3318
Interquartile range (IQR): 34

Descriptive statistics

Standard deviation: 136.3179994
Coefficient of variation (CV): 2.864456731
Kurtosis: 128.901424
Mean: 47.58947758
Median Absolute Deviation (MAD): 12
Skewness: 9.231220255
Sum: 1284440
Variance: 18582.59696
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 2948 | 10.9%
1 | 1457 | 5.4%
2 | 1255 | 4.6%
3 | 1140 | 4.2%
4 | 985 | 3.6%
5 | 929 | 3.4%
6 | 881 | 3.3%
7 | 810 | 3.0%
8 | 723 | 2.7%
9 | 632 | 2.3%
Other values (759) | 15230 | 56.4%

Value | Count | Frequency (%)
0 | 2948 | 10.9%
1 | 1457 | 5.4%
2 | 1255 | 4.6%
3 | 1140 | 4.2%
4 | 985 | 3.6%
5 | 929 | 3.4%
6 | 881 | 3.3%
7 | 810 | 3.0%
8 | 723 | 2.7%
9 | 632 | 2.3%

Value | Count | Frequency (%)
3318 | 1 | < 0.1%
3303 | 1 | < 0.1%
3250 | 1 | < 0.1%
3046 | 1 | < 0.1%
2815 | 1 | < 0.1%
2685 | 1 | < 0.1%
2633 | 1 | < 0.1%
2517 | 1 | < 0.1%
2432 | 1 | < 0.1%
2400 | 1 | < 0.1%

month_of_launch
Real number (ℝ≥0)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.087106336
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 421.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 6
Q3: 9
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.395907488
Coefficient of variation (CV): 0.5578853565
Kurtosis: -1.235508877
Mean: 6.087106336
Median Absolute Deviation (MAD): 3
Skewness: 0.1725523548
Sum: 164291
Variance: 11.53218767
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
3 | 2901 | 10.7%
2 | 2852 | 10.6%
4 | 2726 | 10.1%
5 | 2552 | 9.5%
10 | 2339 | 8.7%
11 | 2182 | 8.1%
1 | 2121 | 7.9%
9 | 2098 | 7.8%
7 | 2029 | 7.5%
6 | 1897 | 7.0%
Other values (2) | 3293 | 12.2%

Value | Count | Frequency (%)
1 | 2121 | 7.9%
2 | 2852 | 10.6%
3 | 2901 | 10.7%
4 | 2726 | 10.1%
5 | 2552 | 9.5%
6 | 1897 | 7.0%
7 | 2029 | 7.5%
8 | 1819 | 6.7%
9 | 2098 | 7.8%
10 | 2339 | 8.7%

Value | Count | Frequency (%)
12 | 1474 | 5.5%
11 | 2182 | 8.1%
10 | 2339 | 8.7%
9 | 2098 | 7.8%
8 | 1819 | 6.7%
7 | 2029 | 7.5%
6 | 1897 | 7.0%
5 | 2552 | 9.5%
4 | 2726 | 10.1%
3 | 2901 | 10.7%

weekday_of_launch
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.386031864
Minimum: 0
Maximum: 6
Zeros: 4659
Zeros (%): 17.3%
Negative: 0
Negative (%): 0.0%
Memory size: 421.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 4
95-th percentile: 6
Maximum: 6
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.797575302
Coefficient of variation (CV): 0.7533743908
Kurtosis: -0.90589679
Mean: 2.386031864
Median Absolute Deviation (MAD): 1
Skewness: 0.3676106292
Sum: 64399
Variance: 3.231276965
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value | Count | Frequency (%)
1 | 5747 | 21.3%
0 | 4659 | 17.3%
2 | 4538 | 16.8%
3 | 4160 | 15.4%
4 | 4026 | 14.9%
5 | 2168 | 8.0%
6 | 1692 | 6.3%

Value | Count | Frequency (%)
0 | 4659 | 17.3%
1 | 5747 | 21.3%
2 | 4538 | 16.8%
3 | 4160 | 15.4%
4 | 4026 | 14.9%
5 | 2168 | 8.0%
6 | 1692 | 6.3%

Value | Count | Frequency (%)
6 | 1692 | 6.3%
5 | 2168 | 8.0%
4 | 4026 | 14.9%
3 | 4160 | 15.4%
2 | 4538 | 16.8%
1 | 5747 | 21.3%
0 | 4659 | 17.3%

hour_of_launch
Real number (ℝ≥0)

ZEROS

Distinct: 24
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.67021119
Minimum: 0
Maximum: 23
Zeros: 1045
Zeros (%): 3.9%
Negative: 0
Negative (%): 0.0%
Memory size: 421.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 9
median: 15
Q3: 19
95-th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.68715033
Coefficient of variation (CV): 0.4891768121
Kurtosis: -0.7088038079
Mean: 13.67021119
Median Absolute Deviation (MAD): 4
Skewness: -0.6383662374
Sum: 368959
Variance: 44.71797954
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
Value | Count | Frequency (%)
16 | 2104 | 7.8%
17 | 2042 | 7.6%
15 | 1896 | 7.0%
18 | 1803 | 6.7%
19 | 1711 | 6.3%
14 | 1581 | 5.9%
20 | 1513 | 5.6%
21 | 1423 | 5.3%
22 | 1370 | 5.1%
13 | 1229 | 4.6%
Other values (14) | 10318 | 38.2%

Value | Count | Frequency (%)
0 | 1045 | 3.9%
1 | 941 | 3.5%
2 | 851 | 3.2%
3 | 719 | 2.7%
4 | 711 | 2.6%
5 | 591 | 2.2%
6 | 452 | 1.7%
7 | 525 | 1.9%
8 | 524 | 1.9%
9 | 512 | 1.9%

Value | Count | Frequency (%)
23 | 1220 | 4.5%
22 | 1370 | 5.1%
21 | 1423 | 5.3%
20 | 1513 | 5.6%
19 | 1711 | 6.3%
18 | 1803 | 6.7%
17 | 2042 | 7.6%
16 | 2104 | 7.8%
15 | 1896 | 7.0%
14 | 1581 | 5.9%

location_state_CA
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 23548 | 87.2%
True | 3442 | 12.8%

location_state_England
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 24315 | 90.1%
True | 2675 | 9.9%

location_state_FL
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 26166 | 96.9%
True | 824 | 3.1%

location_state_IL
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 26301 | 97.4%
True | 689 | 2.6%

location_state_MA
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 26403 | 97.8%
True | 587 | 2.2%

location_state_NY
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 24262 | 89.9%
True | 2728 | 10.1%

location_state_Other
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 13668 | 50.6%
True | 13322 | 49.4%

location_state_PA
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 26401 | 97.8%
True | 589 | 2.2%

location_state_TX
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 26003 | 96.3%
True | 987 | 3.7%

location_state_WA
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 26360 | 97.7%
True | 630 | 2.3%

country_AU
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 26445 | 98.0%
True | 545 | 2.0%

country_CA
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 25752 | 95.4%
True | 1238 | 4.6%

country_DE
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 26513 | 98.2%
True | 477 | 1.8%

country_ES
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 26677 | 98.8%
True | 313 | 1.2%

country_FR
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 26566 | 98.4%
True | 424 | 1.6%

country_GB
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 23896 | 88.5%
True | 3094 | 11.5%

country_IT
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 26622 | 98.6%
True | 368 | 1.4%

country_MX
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 26702 | 98.9%
True | 288 | 1.1%

country_Other
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 25903 | 96.0%
True | 1087 | 4.0%

country_US
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
True | 18930 | 70.1%
False | 8060 | 29.9%

category_Classical Music
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 24602 | 91.2%
True | 2388 | 8.8%

category_Dance
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 25623 | 94.9%
True | 1367 | 5.1%

category_Graphic Novels
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 24652 | 91.3%
True | 2338 | 8.7%

category_Jewelry
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 25267 | 93.6%
True | 1723 | 6.4%

category_Narrative Film
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 24599 | 91.1%
True | 2391 | 8.9%

category_Other
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 17154 | 63.6%
True | 9836 | 36.4%

category_Performances
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 25766 | 95.5%
True | 1224 | 4.5%

category_Poetry
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 25334 | 93.9%
True | 1656 | 6.1%

category_Tabletop Games
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 25638 | 95.0%
True | 1352 | 5.0%

category_Wearables
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 25460 | 94.3%
True | 1530 | 5.7%

staff_pick_True
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
False | 22390 | 83.0%
True | 4600 | 17.0%

is_starrable_False
Boolean

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 237.2 KiB

Value | Count | Frequency (%)
True | 26990 | 100.0%

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
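
The calculation described above can be sketched in plain Python: rank both variables (tied values share the mean of their positions), then divide the covariance of the ranks by the product of their standard deviations. The function names are illustrative and not the report's internals; library implementations may differ in how they treat ties or missing values.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y grows monotonically but non-linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman(x, y), pearson(x, y))
```

On this toy input the rank-based coefficient is at its maximum even though the relationship is not linear, which is the property the paragraph above describes.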

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
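
As a small illustration of the formula and of the invariance property mentioned above, the sketch below computes r directly from its definition and then recomputes it after shifting and rescaling one variable by a positive factor. The function name and the data are made up for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)


# Hypothetical data, not taken from the dataset.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)

# Change location and scale of x only: r should stay the same.
x_rescaled = [10 * a + 7 for a in x]
r_rescaled = pearson_r(x_rescaled, y)

print(r, r_rescaled)
```

The invariance holds for positive scale factors; multiplying one variable by a negative number flips the sign of r.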

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
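
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows; the function name is illustrative, and many libraries default to tau-b, which adjusts the denominator for ties, so results can differ on data with repeated values.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie adjustment."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau_a(x, y))
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6.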

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
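
For orientation, a bias-corrected Cramér's V along the lines of Bergsma's 2013 proposal can be sketched as follows. This is an illustration under stated assumptions, not the report's implementation: the function name is made up, the chi-squared statistic is computed without any continuity correction, and every category is assumed to occur at least once.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r_val, r_cnt in row_totals.items():
        for c_val, c_cnt in col_totals.items():
            expected = r_cnt * c_cnt / n
            observed = joint.get((r_val, c_val), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias corrections for phi^2 and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Hypothetical boolean columns.
identical = [True, False] * 10
independent_a = [0, 0, 1, 1] * 5
independent_b = [0, 1, 0, 1] * 5

print(cramers_v_corrected(identical, identical))
print(cramers_v_corrected(independent_a, independent_b))
```

Two identical columns give a value of 1 and two columns whose cells match their expected counts give 0, which are the two ends of the range described above.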

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

goalidstatecampaign_lengthprep_timemonth_of_launchweekday_of_launchhour_of_launchlocation_state_CAlocation_state_Englandlocation_state_FLlocation_state_ILlocation_state_MAlocation_state_NYlocation_state_Otherlocation_state_PAlocation_state_TXlocation_state_WAcountry_AUcountry_CAcountry_DEcountry_EScountry_FRcountry_GBcountry_ITcountry_MXcountry_Othercountry_UScategory_Classical Musiccategory_Dancecategory_Graphic Novelscategory_Jewelrycategory_Narrative Filmcategory_Othercategory_Performancescategory_Poetrycategory_Tabletop Gamescategory_Wearablesstaff_pick_Trueis_starrable_False
010000.0522081392successful49172620TrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseTrue
13000.0688419156successful30611318FalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseTrue
220000.01395612011successful29102113FalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseTrueTrue
33000.01895670076successful2973413FalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseTrue
43000.0273779926successful305710315FalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseTrue
530000.01905826891successful31214215FalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseTrueTrue
670000.01606387274failed311466019FalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseTrue
75000.01461979375successful151262116FalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseTrue
824953.01351145783failed26321123FalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseTrueTrue
92000.01492834939successful304411522TrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseTrue

Last rows

goal | id | state | campaign_length | prep_time | month_of_launch | weekday_of_launch | hour_of_launch | location_state_CA | location_state_England | location_state_FL | location_state_IL | location_state_MA | location_state_NY | location_state_Other | location_state_PA | location_state_TX | location_state_WA | country_AU | country_CA | country_DE | country_ES | country_FR | country_GB | country_IT | country_MX | country_Other | country_US | category_Classical Music | category_Dance | category_Graphic Novels | category_Jewelry | category_Narrative Film | category_Other | category_Performances | category_Poetry | category_Tabletop Games | category_Wearables | staff_pick_True | is_starrable_False
2698010000.0693897183successful50201124TrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrue
26981500.02028386415failed290317FalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseTrue
269821500.0645573052successful29443212FalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrue
269838000.0274524291successful5057711FalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseTrue
2698422000.0554027675successful331410116TrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseTrueTrue
2698530000.0457952399failed35302312TrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseTrue
269861000.02127789307successful3378120FalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseTrue
2698715000.01362143084failed2909113FalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseTrue
269881100.01674716636failed6001133FalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseTrue
269894000.02114005133successful4133307FalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseTrue